Abstract

We present a description of the theoretical framework and “best practice” for using
the paleo-climate model component of the Coupled Model Intercomparison Project
(Phase 5) (CMIP5) to constrain future projections of climate using the same models.
The constraints arise from measures of skill in hindcasting paleo-climate changes from
the present over 3 periods: the Last Glacial Maximum (LGM) (21 thousand years before
present, ka), the mid-Holocene (MH) (6 ka) and the Last Millennium (LM) (850-1850 CE).
The skill measures may be used to validate robust patterns of climate change across
scenarios or to distinguish between models that have differing outcomes in future
scenarios. We find that the multi-model ensemble of paleo-simulations is adequate for
addressing at least some of these issues. For example, selected benchmarks for the LGM
and MH are correlated to the rank of future projections of precipitation/temperature
or sea ice extent to indicate that models that produce the best agreement with
paleoclimate information give demonstrably different future results than the rest of
the models. We also find that some comparisons, for instance associated with model
variability, are strongly dependent on uncertain forcing timeseries, or show time
dependent behaviour, making direct inferences for the future problematic. Overall, we
demonstrate that there is a strong potential for the paleo-climate simulations to help
inform the future projections and urge all the modeling groups to complete this subset
of the CMIP5 runs.

1 Introduction

The Coupled Model Intercomparison Project (Phase 5) (CMIP5) is an ongoing coordinated
project instigated by the Working Group on Coupled Modelling (WGCM) at the World
Climate Research Programme (WCRP) and consisting of contributions from over 25 climate
modeling groups (and over 30 climate models) from around the world (Taylor et al.,
2012). Multiple experiments are being coordinated, including historical


CPD 

9, 775-835, 2013 


Using paleo-climate 
comparisons to 
constrain future 
projections in CMIP5 

G. A. Schmidt et al. 


simulations for the 20th Century (starting from 1850), future simulations following
multiple Representative Concentration Pathways (RCPs) and crucially, for the first time
in CMIP, three sets of paleo-climate simulations for the Last Glacial Maximum (LGM)
(21 K Before Present (BP)), the Mid-Holocene (MH) (6 K BP) and the Last Millennium
(850-1850 CE). The paleo simulations are also part of the Paleoclimate Model
Intercomparison Project (Phase 3) (PMIP3) initiative.

The CMIP5/PMIP3 paleo-simulations are true “out-of-sample” tests in that none of
the models have been “tuned” to produce better paleo climates. Such tuning is not
necessarily unwise (see Schneider von Deimling et al. (2006) for an example), but would
complicate some of the potential analyses. Because the same models are being used for
both past and future simulations, this dataset is a unique resource for research into the
connections between model skill and model predictions, and has the potential to greatly
improve assessments of future climate change.

There were many uncertainties in climate projections highlighted in the IPCC AR4
(Meehl et al., 2007). Many of these, such as the future of sub-tropical rainfall, El
Niño/Southern Oscillation (ENSO) changes, potential declines in the North Atlantic
meridional circulation, the fate of Arctic sea ice, etc. have important regional impacts.
Reducing these uncertainties in the projections could have significant real world
consequences for both adaptation and mitigation strategies. The three main classes of
prediction uncertainty relate to: the choice of scenario, internal variability (sometimes
described as initial condition uncertainty), and the imperfections in the model (or
structural uncertainty) (Hawkins and Sutton, 2009). Scenario uncertainties inevitably grow
in importance with time, particularly after about 30 yr, due to the time-scales associated
with economic change, CO2 residence time and ocean thermal inertia. Initial condition
uncertainty is globally important on scales of a few years (and longer at smaller
spatial scales) but predictability is fundamentally limited by the chaotic dynamics of the
atmosphere and upper ocean. Thus at the multi-decadal time-horizon, reducing and/or
better characterizing structural uncertainty is the only way to reduce overall uncertainty.
These structural uncertainties (given a specific scenario of future emissions and other
drivers) arise from a combination of model divergence - i.e. a large spread in model
predictions given the same future scenario, and model inadequacy - i.e. models that
are collectively either incomplete, inaccurate or are missing processes or feedbacks.
The first effect is explicit (though not completely explored) in the multi-model ensemble,
while the second is implicit and needs to be assessed independently.

Observations provide the means to potentially test the models and reduce these
uncertainties, but unfortunately, instrumental records of useful data targets are few
(essentially limited to in situ networks of temperature and rainfall prior to the satellite
era), and perhaps more importantly, changes in the recent past are relatively small
compared to projections for the future. Furthermore, the majority of skill metrics in
historical (20th Century) simulations do not constrain future projections: models that
are either good or bad at simulating some aspect of modern climate - the climatology,
seasonal cycle, or interannual variability - often give essentially the same spread of
future projections (Santer et al., 2010; Knutti et al., 2010). Paleo-climate changes from
the present offer a substantially larger signal and although paleo-climate records are
often affected by substantial noise and difficulties in interpretation (Schmidt, 2010), the
most robust reconstructions can provide a crucial test of model performance over a range
wider than that of the 20th Century climates.

There has been much evaluation of paleo-climate simulations via earlier incarnations
of PMIP, as well as many individual studies (see the review by Braconnot et al., 2012,
and references therein). However, there has been a lack of analyses that quantitatively
link future simulations or forecasts with skill or sensitivity in the paleo-climate
simulations (though see Hargreaves et al., 2012b, for an example). Partly this is because
(prior to CMIP5) paleo-simulations were not done with exactly the same versions of
the models being used for future projections and partly through a lack of suitable skill
metrics for paleo-climate change. This paper is therefore specifically not focused on
understanding paleo-climate change for its own sake but rather is meant as a guide to
the appropriate theoretical framework for quantitatively linking past and future that can
be applied to data from the CMIP5 archive.
For clarity in the rest of the text, we define the term “ensemble” to denote the full
multi-model database of results across all scenarios (which here encompasses all
paleoclimate, historical, idealized and future projection simulations). The future
projections used here consist of the four RCP scenarios (rcp26, rcp45, rcp60, rcp85) (future
possibilities that roughly produce radiative forcing at the year 2100 relative to 2000 of
2.6, 4.5, 6.0, and 8.5 W m^-2, respectively) along with idealised simulations that have
been included to provide clean comparisons across models (such as 1 % increasing CO2
simulations, the response to an abrupt increase to 4 x CO2, atmosphere-only simulations
etc.). For ease of reference, we will use CMIP5 to refer to the entire database,
including the PMIP3 simulations. Specific model simulations are referred to by their
name in the CMIP5 database (e.g. rcp85, past1000, piControl etc.), while the scenarios
or periods are referred to more generally using a standard abbreviation or name
(e.g. the LGM, MH, RCP 4.5).

The scope of the paper is as follows: Sect. 2 discusses theoretical frameworks for
dealing with the multi-model ensemble, issues arising from the use of paleo-proxy data
and the use of data-synthesis products; Sects. 3 and 4 discuss specific examples of
skill metrics that may have predictive power in future simulations by showing robust
behaviour across paleo and future experiments, or discriminate between future
projections. Sect. 5 presents some exploratory analysis of additional potentially useful
metrics that either diverge over time or are too sensitive to important uncertainties;
Sect. 6 concludes and discusses the potential for further work in this area. We list the
models that we have used in analyses in this paper, along with the specific experiments
and simulation IDs, in Table 1.
2 Methodologies

2.1 Palaeoclimate reconstructions

Many of the problems in dealing with reconstructing climate from paleodata are specific 
to the type of record, the time period and resolution concerned - for instance, annually 
resolved tree rings have issues distinct from lower resolution ocean sediment or pollen
records (e.g. Kohfeld and Harrison, 2000; Ramstein et al., 2007; Jones et al., 2009;
Harrison and Bartlein, 2012). There are, however, a number of general issues that affect
the use of such data for model evaluation, including the potential for multiple climate 
controls on a given record, the scale over which they are representative, the need 
to quantify (and take into account) reconstruction uncertainties, and the sparse and 
uneven site coverage. 

Records used for palaeoclimate reconstructions are in general influenced by several
different aspects of climate as well as, potentially, non-climatic factors. For instance,
oxygen or hydrogen isotopes from ice cores, carbonates or organic matter are physically
meaningful variables, but do not necessarily have a one-to-one stationary relationship
with temperature or precipitation (e.g. Werner et al., 2000; Schmidt et al.,
2007; Masson-Delmotte et al., 2011). Vegetation, in addition to being influenced by
several aspects of seasonal climate, is directly influenced by the atmospheric CO2
concentration (Prentice and Harrison, 2009). There are several approaches that have
been adopted to overcome this type of problem: the use of multi-proxy reconstruction
techniques, forward modeling of the system within a climate model or using
climate-model output (see an example related to coral carbonate isotopes in Sect. 5.1), and
model inversion or data assimilation. Multi-proxy reconstructions rely on the idea that
different types of record will be sensitive to different aspects of climate, and that
pooling the information from each of these records therefore provides a more robust
reconstruction of any specific climate variable. Insofar as forward modeling (and
by extension model inversion techniques) is based on physical and/or physiological
knowledge of the given system, the use of these approaches may be a more robust
way of dealing with the non-stationarity issue - however, as with climate models, the
results are constrained by the quality of the models and the degree to which the system
is well-understood (see for example the discussion of CO2 fertilisation in Denman
et al., 2007).

The scale over which a record is representative can be a major issue in comparing
paleodata and model output. All types of records are responding to local conditions,
and for basic meteorological variables it is rare for a record to be representative for
spatial scales of more than 50-100 km (though many records, such as tropical ice core
δ18O, may have strong correlations to climate further afield; e.g. Schmidt et al., 2007).
Comparisons at these scales often require some form of dynamical or statistical
downscaling of model output, though there are many associated issues (Wilby and Wigley,
1997). Alternatively, up-scaling reconstructions (for instance, through the use of
gridding) can often reveal large-scale patterns that models could be expected to resolve,
although this requires a sufficiently dense network of sites. Recent developments
include the use of cluster analysis to classify types of model behaviour and to determine
cohesive regions for comparison with the large-scale patterns in the observations (e.g.
Bonfils et al., 2004; Brewer et al., 2007; Harrison et al., 2013).

Paleoclimate reconstructions are usually accompanied by estimates of measurement
or statistical uncertainty. In past practice these uncertainties were rarely propagated
into large-scale synthetic products (except in terms of non-quantitative quality
control measures, see e.g. COHMAP, 1988) and even more rarely taken into account
when the reconstructions were used for model evaluation. However, quantitative
measures of uncertainty have been included in more recent palaeoclimate syntheses (e.g.
MARGO, 2009; Bartlein et al., 2011) and the use of fuzzy-distance measures (Guiot
et al., 1999) provides an explicit way to take account of data uncertainties in data-model
comparisons. It is worth noting that model-data differences cannot be expected to be
smaller than the data uncertainties themselves.
2.2 Paleo-modelling issues

There are two particular issues that are more problematic in paleoclimate simulations 
than, for instance, simulations of the 20th Century: model drift and forcing uncertainty. 

The issue of coupled climate model drift arises because of the long (~thousands
of years) time required to bring the deep ocean into equilibrium in coupled
ocean-atmosphere models. In some cases, insufficient spin-up time may have been allowed
before specific experiments are started. While drift also affects transient historical
simulations, the relatively large forcings in the 20th Century mean that residual drift is
usually a small component of the transient response. For simulations of the last
millennium though, the forcings are much smaller, and drift in the early centuries of the
simulation will be a larger fraction of the modelled change (Osborn et al., 2006;
Fernández-Donado et al., 2012). One proposal to deal with this is via a correction using
the drift in the control simulation (i.e. calculating a smooth trend and removing it from
the perturbed simulation prior to analysis). While this works well for temperature, it
works less well for variables that exhibit threshold behaviour such as sea ice extent or
precipitation. In practice, this issue needs to be assessed for each proposed comparison.
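As an illustration of the control-based drift correction described above, the following is a minimal sketch only. It assumes that the "smooth trend" can be approximated by a least-squares straight line fitted to the control run, and that the control and perturbed series share the same time axis; the function names are hypothetical and a real analysis might prefer a low-order polynomial or a low-pass filter.

```python
def linear_trend(values):
    """Least-squares slope and intercept of a series against its time index."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    cov = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, v_mean - slope * t_mean


def remove_control_drift(perturbed, control):
    """Subtract the control run's fitted drift (slope only) from the perturbed run.

    Assumption: drift is linear in time and identical in both runs.
    The mean state of the perturbed run is left untouched.
    """
    slope, _ = linear_trend(control)
    return [p - slope * t for t, p in enumerate(perturbed)]
```

Only the slope is removed, so the corrected series keeps its initial level. As the text notes, a correction of this kind is less appropriate for threshold variables such as sea ice extent or precipitation.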

Secondly, there are important uncertainties in the forcings used for the paleoclimate
experiments. This is also true for aerosols in the 20th Century simulations, for instance,
but such issues are more prevalent in paleo-simulations. For example, both the
magnitude of solar or volcanic forcing over the last millennium, and the size and height of
ice sheets at the LGM are sources of major uncertainty. In the last millennium
experiments, multiple forcing choices were proposed (Schmidt et al., 2011, 2012), but few
groups have attempted (as yet) to comprehensively explore all the options, and this is
also true for uncertainties associated with other time periods. If an insufficient range of
different forcings is tested, it is plausible that mismatches between observations and
simulations may be wrongly attributed to the model (or observations), when in fact they
were related to a mis-specified forcing (e.g. Kageyama et al., 2001).
It should also be noted that multi-model ensembles are not a controlled sample from
a well-defined distribution of plausible simulations. Models are necessarily incomplete
and there are common biases that have more to do with the state of computational
technology than physics (for instance, poor or non-existent resolution of ocean
eddies). Multi-model ensemble means can be informative and will generally outperform
individual models (Annan and Hargreaves, 2011), but care must be taken to assess
the suitability of each included model and weighting of individual models needs to be
well justified (Knutti et al., 2010).

2.3 Approaches to comparing reconstructions and simulations

There has been a gradual evolution in the approaches for comparing reconstructed
changes and simulations from essentially qualitative graphical comparisons of model
output and reconstructions of the corresponding climatic variables (e.g. Braconnot
et al., 2007) to more quantitative approaches that measure model-data mismatch via
some “metric” or distance function (e.g. Sundberg et al., 2012; Izumi et al., 2013).
Metrics based on correlations or RMS differences between fields of modern data
and model output have been commonly used in model evaluation (e.g. Taylor, 2001;
Schmidt et al., 2006; Gleckler et al., 2008). These methods provide opportunities for
both inter- and intra-generational model comparisons (Reichler and Kim, 2008;
Harrison et al., 2013).

Focusing on the collective performance of the ensemble as a whole, Hargreaves
et al. (2011) tested the ability of the PMIP2 ensemble to represent the Last Glacial
Maximum in terms of its “reliability”, defined as the adequacy of the ensemble,
considered in probabilistic terms, in predicting the changes documented in the paleo-climate
archives during that interval. The concept of “skill” as adopted in the numerical weather
prediction community is also useful as a quantitative test of model performance: that is,
does a model produce a more accurate prediction (match to the paleo-climate record)
than that which would be achieved by a simple null hypothesis? (Hargreaves et al.,
2012b). While many studies have focused on time-slice or time-series comparisons,
nothing precludes comparing the simulations and paleo-record in the frequency
domain. Recent work has looked at the fluctuations in forcings and data as a function
of timescale, and in principle, these fingerprints could also be useful (Lovejoy and
Schertzer, 2012).
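The notion of skill relative to a null hypothesis can be made concrete with a short sketch. The form below, one minus the ratio of root-mean-square errors, is a generic textbook skill score and is an assumption here; it is not necessarily the exact definition used in the cited studies, and it ignores observational uncertainty. The function names are hypothetical.

```python
import math


def rmse(prediction, observation):
    """Root-mean-square difference between two equally long sequences."""
    n = len(observation)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(prediction, observation)) / n)


def skill_score(model, reference, observation):
    """Skill of `model` relative to a null `reference` forecast.

    1 means a perfect match, 0 means no better than the reference,
    and negative values mean worse than the reference.
    """
    return 1.0 - rmse(model, observation) / rmse(reference, observation)
```

For paleo-climate anomalies a natural null hypothesis is "no change from the present", i.e. a reference of zeros at every site.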

2.4 Linking past and future

There are two main ways in which data-model comparisons can be used as a guide to the
future - either as a validation of a robust relationship across models and scenarios,
or as a method to discriminate between different models. A prerequisite for the latter
is that the metric chosen actually correlates to future outcomes within the
ensemble. If this is not the case, then the metric is orthogonal to the spread in the
projections and cannot be used to constrain it. Even when such a relationship is found,
we need to consider whether it is physically meaningful to be confident that it has not
arisen either through chance due to a small sample size or as an artifact of the model
or the experimental design. While connections may in principle be highly complex, it is
natural as a first step to consider whether a correlation exists between past and future
behaviour in the same diagnostic. The search for useful metrics (in this sense) using
modern data has generally been disappointing (Knutti et al., 2010), although there have
been a small number of cases where apparently meaningful relationships have been
found (Boé et al., 2006; Hall and Qu, 2009; Fasullo and Trenberth, 2012). It is notable
that the first two examples relate future climate changes to externally-forced changes
in the modern climate (relating to decadal trend, and seasonal range, respectively),
rather than using metrics based on the climatological mean state alone. This lends
support to our working hypothesis that past variations seen in paleoclimate simulations
might also be informative about the future as well as increasing understanding about
the past.

Where a credible relationship between past and future is found, there is a range of
methods that can be applied to use observations to constrain future predictions (Collins
et al., 2012). One method, applied by both Boé et al. (2006) and Hall and Qu (2009),
is to take the observational estimate, and use the relationship (often linear) embodied
in the correlation of model output to project this value into the future. An attractive
feature of this approach, beyond its simplicity, is that it readily allows extrapolation of
the observed relationship in the case where the true value is suspected of lying outside
the model range. An alternative approach, which has been widely applied to perturbed
physics ensembles and is more explicitly Bayesian, considers the ensemble as a
probabilistic sample. For the prior, equal weight is typically assigned to each ensemble
member. Probabilistic weights are then calculated for each member of the ensemble,
according to their performance in reproducing the observations. This weighted ensemble
now represents the posterior estimate of future change. This method uses the model spread
as a prior constraint, which depending on one’s viewpoint, and the specific case in
question, may be considered either a strength or weakness of this approach. These
and other methods are discussed in more detail in Collins et al. (2012).
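The two approaches described above can be contrasted in a small sketch. It assumes one scalar past metric and one scalar future change per model, an ordinary least-squares line for the regression approach, and a Gaussian likelihood with a single observational uncertainty `obs_sigma` for the weighting approach. All names and the choice of likelihood are illustrative assumptions, not the procedures of the cited papers.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def regression_constraint(past, future, observed_past):
    """Project the observed past value through the across-model linear relation."""
    slope, intercept = fit_line(past, future)
    return intercept + slope * observed_past


def weighted_posterior_mean(past, future, observed_past, obs_sigma):
    """Equal prior weights, updated by a Gaussian likelihood of each model's past value."""
    weights = [math.exp(-0.5 * ((p - observed_past) / obs_sigma) ** 2) for p in past]
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, future)) / total
```

When the observed value lies outside the ensemble, the regression estimate extrapolates beyond the model range, whereas the weighted mean necessarily stays within the spread of the models' future values. This is the prior-constraint property noted in the text.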

3 Robust metrics

In this section we highlight physically-based correlations between key metrics that show
similar patterns in the paleo-climate runs and in future projections (or more idealised
scenarios). With evaluation via the paleo-climate record, these metrics can be
considered robust, and thus provide contingent predictions of one variable given a potential
change in the other.

3.1 Patterns of regional climate change vs. global means

The main climate forcings for the LGM are the lower concentrations in atmospheric
greenhouse gases and the presence of Laurentide and Fenno-Scandinavian ice-sheets
in the northern extratropics. The ice sheets have a strong local albedo effect (e.g.
Braconnot et al., 2012) but also affect the mid-latitude large-scale atmospheric circulation
due to the associated change in topography (e.g. Pausata et al., 2011; Rivière et al.,
2009; Laîné et al., 2009). However, away from this perturbation for the atmospheric
radiative budget and for the atmospheric dynamics, we expect that the greenhouse
gas forcing would be the main forcing for the LGM climate change. There could then
be a relationship between LGM climate change and future climate change for a given
model, which could be useful in testing the ability of climate models in reproducing
regional climate change relative to the global change.

Figure 1 shows the results comparing the mean annual surface air temperature
change over a region compared to the global mean change for the abrupt4xCO2,
1pctCO2 and lgm CMIP5 simulations across a suite of models. We have considered
the tropics (land + oceans) and the tropical oceans, which have been used previously in
perturbed physics ensemble studies (Schneider von Deimling et al., 2006; Hargreaves
et al., 2007), East Antarctica, for which the temperature change is shown to scale with
global temperature change for the LGM and the CMIP3 2xCO2 and 4xCO2 changes
(Masson-Delmotte et al., 2006a, b) and the well documented mid-latitude region of the
North Atlantic and Europe.

For the tropics, for land and ocean points as well as ocean points only, Fig. 1 shows
that the relationship between the regional and global temperature change exists for the
1pctCO2 and abrupt4xCO2 anomalies, and is consistent across these two experiments.
Such a relationship also exists for the LGM, but the slope is clearly lower than
for the increased CO2 experiments. Furthermore, the models which simulate the
smallest warming for increased CO2 are not those which simulate the smallest cooling for
LGM (and similarly for the models with the largest warmings and coolings). The
regional vs. global temperature change relationship appears more consistent between
LGM and increased GHG forcings for East Antarctica and, surprisingly, over the North
Atlantic/Europe region. However, for the tropics, rankings of the models according to
their cooling for LGM and warming for 1pctCO2 and abrupt4xCO2 are not consistent.
This indicates that either the impacts of the lower GHG concentrations are not
symmetric compared to those for increased GHG concentrations, or that the ice-sheet remote
impact extends to the tropics (Laîné et al., 2009).
3.2 Land-ocean contrasts

Model results have consistently shown that for the LGM, the continents cooled more
than the ocean (e.g. Braconnot et al., 2007, 2012; Laîné et al., 2009), while, in a
symmetric manner, predictions for future climate show a stronger warming over land than
over the oceans (e.g. Sutton et al., 2007; Drost et al., 2011). The ratio between cooling
over land and cooling over the ocean for the LGM tropics was ~1.3 in the PMIP1
computed sea surface temperature (SST) simulations (Pinot et al., 1999), a result close to
the ratio of ~1.5 found for the PMIP2 fully coupled LGM experiments (Braconnot et al.,
2012) and conspicuously close to the 1.5 ratio found by Sutton et al. (2007) for future
climate.

This relationship also holds in the most recent CMIP5 simulations (Fig. 2) not only for
the tropics but also for the well-documented region of the North Atlantic and Europe,
consistent with the LGM data. It is worthwhile to note that this pattern was previously
used to highlight the inconsistency in an earlier compilation of tropical LGM sea surface
temperatures (Rind and Peteet, 1985). We conclude that these relationships are indeed
robust, although they appear imperfectly understood (Lambert et al., 2011).
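The land-ocean ratio discussed in this section can be sketched as follows. The sketch assumes a regular latitude-longitude grid, so that cell area is proportional to the cosine of latitude, and a boolean land mask; the input layout of `(latitude, temperature change, is_land)` tuples and the function names are hypothetical.

```python
import math


def area_weighted_mean(cells):
    """Mean of values weighted by cos(latitude); cells are (lat_deg, value) pairs."""
    weights = [math.cos(math.radians(lat)) for lat, _ in cells]
    return sum(w * v for w, (_, v) in zip(weights, cells)) / sum(weights)


def land_ocean_ratio(cells):
    """Ratio of mean land change to mean ocean change.

    cells are (lat_deg, delta_T, is_land) tuples for the region of interest.
    """
    land = [(lat, dt) for lat, dt, is_land in cells if is_land]
    ocean = [(lat, dt) for lat, dt, is_land in cells if not is_land]
    return area_weighted_mean(land) / area_weighted_mean(ocean)
```

Because the ratio is of two changes with the same sign, an LGM cooling of 3 over land and 2 over ocean gives the same value (1.5) as a future warming of 3 and 2, which is what allows the past and future ratios to be compared directly.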

3.3 Regional extremes

Extreme climate events such as heatwaves and cold spells can have long lasting
impacts on society or ecosystems (IPCC SREX, 2012). The development of such events
spans days to a few weeks, so that they are largely intra-seasonal by nature
(Seneviratne et al., 2012). In such a context, the generally linear relationship between
reconstructions and actual climate can be strongly distorted. Moreover, since extreme events
are by definition rare, large numbers of examples are required to get good statistics.
Simulations of the past millennium offer a promising tool to investigate modeled
extremes since they sample a large range of possible cases. The strongest limitation for
an application of this method to paleoclimatic data has been the necessity of dealing
with daily data in order to capture behavior that is non-Gaussian and the need for proxy
data that record extreme variables (Jomelli et al., 2007). However, if we can
demonstrate the robustness of the relationships between short and longer term statistics over
long periods of time, and/or their dependence on external forcings, we can potentially
predict the behavior of temperature extremes in the future.

The statistical analyses of (daily) temperature hot extremes of the 20th century have
shown that temperature is generally a bounded variable, for which the upper bound can
be computed from the statistical parameters of extremes (Parey et al., 2010a,b).
Diagnostic studies focusing on the probability distribution of temperature and precipitation
extremes are often based on the application of Extreme Value Theory (EVT), though
simpler metrics have also been used (e.g. Hansen et al., 2012). EVT describes the
behavior of the probability distribution near the tails, and allows one to compute return
levels for return periods that are longer than the period of observation (Coles, 2001).
It has been applied to meteorological observations (Parey et al., 2010a), reanalysis
data (Nogaj et al., 2006) and model simulations (Kharin et al., 2005, 2007) in order to
quantify trends of extremes.
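To make the return-level and upper-bound statements concrete, the sketch below evaluates the standard Generalized Extreme Value (GEV) quantile formula for block maxima. It assumes that the location `mu`, scale `sigma` and shape `xi` have already been estimated elsewhere (fitting is not shown), and uses the sign convention in which a negative shape gives a bounded upper tail. The function names are hypothetical.

```python
import math


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded on average once every `return_period` blocks (e.g. years)."""
    p = 1.0 / return_period            # exceedance probability per block
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-12:                # Gumbel limit of the GEV family
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def gev_upper_bound(mu, sigma, xi):
    """Finite upper end point of the distribution, defined only for xi < 0."""
    if xi >= 0:
        raise ValueError("the GEV distribution is unbounded above unless xi < 0")
    return mu - sigma / xi
```

With a negative shape parameter, return levels for increasingly long return periods approach the upper bound without exceeding it, which is the sense in which temperature behaves as a bounded variable.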

It has also been shown that the extremes of hot and cold temperatures are correlated
with mean temperatures over the northern extra-tropics (Yiou et al., 2009). Until now,
few models had provided daily output of temperature or precipitation on multi-century
timescales (Jansen et al., 2007). However, with increasing storage capacity, daily
resolution data is becoming more common and was requested for simulations in the CMIP5
archive (Yiou et al., 2012).

In the extra-tropics, seasonal summer heatwaves are generally preceded by
droughts in the winter-spring seasons (Fischer et al., 2007; Vautard et al., 2007) with
a mechanism that involves a positive feedback between sensible heat fluxes,
evapotranspiration and temperature (Schär et al., 1999) and this has also been found in
global and regional models for future projections (Seneviratne et al., 2006, 2010;
Quesada et al., 2012). A useful statistical metric to connect winter-spring precipitation and
summer temperature is quantile regression. Ordinary least-squares regression focuses
on the mean values of variables to be connected but by setting a threshold based on
high (or low) quantiles of the variable to be predicted, one can build regression
coefficients conditional to high (or low) values of this variable (Koenker, 2005). We illustrate
this diagnostic in Fig. 3, by computing the quantile regression for the 90th and 10th
percentiles of the summer hot day frequency and winter-spring precipitation frequency in the
IPSL-CM5A-MR historical simulation and the E-OBS gridded dataset (Haylock et al.,
2008). The quantile regression slopes illustrate the asymmetry of the precipitation or
temperature dependence for hot or cool summers in Western Europe (Hirschi et al.,
2011; Mueller and Seneviratne, 2012; Quesada et al., 2012; Seneviratne and Koster,
2012).
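A small sketch may help to show what quantile regression computes. It minimises the "pinball" (check) loss for a single predictor by brute force, using the property that an optimal line passes through at least two data points. This is an illustrative implementation for small samples only and is an assumption here, not the procedure used to produce Fig. 3. The function names are hypothetical.

```python
from itertools import combinations


def pinball_loss(residuals, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return sum(tau * r if r >= 0 else (tau - 1) * r for r in residuals)


def quantile_regression(x, y, tau):
    """Slope and intercept of the tau-quantile line of y on x.

    Brute force over all lines through two data points (cubic cost in the
    sample size), so only suitable for small samples with at least two
    distinct x values.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = a + b * x
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        residuals = [yk - (intercept + slope * xk) for xk, yk in zip(x, y)]
        loss = pinball_loss(residuals, tau)
        if best is None or loss < best[0]:
            best = (loss, slope, intercept)
    _, slope, intercept = best
    return slope, intercept
```

Fitting the same data with `tau=0.9` and `tau=0.1` and comparing the two slopes gives a simple measure of the asymmetry described in the text: if the upper-quantile line is steep while the lower-quantile line is nearly flat, the spread of summer outcomes widens as winter-spring precipitation decreases.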

The general picture is that a dry winter/spring tends to favor a hot summer. But while
wet winter-spring conditions are generally followed by cool summers (small spread
between low and high quantiles), dry winter-spring conditions can be followed by cool
summers as well as heatwaves (large spread between low and high quantiles),
because the genesis of heatwaves can be broken in just a few days, due to fast variations
of the synoptic atmospheric circulation (Hirschi et al., 2011; Quesada et al., 2012). This
feature has been tested on CMIP3 and some CMIP5 simulations for the present and
A2/rcp85 scenarios. It was shown that the seasonal predictability of large European
heatwaves decreases under warmer conditions, although their frequency increases
(Quesada et al., 2012).

There have been many studies compiled by historians focusing on European
heatwaves in recent centuries and their impacts on society (Le Roy Ladurie, 2004, 2006;
Barriendos and Rodrigo, 2006; Camuffo et al., 2010). Hence, using a metric to
capture heatwave dynamics is a promising approach to investigate major heatwaves that
struck Europe during the last millennium, and to explore the relationship between
summer temperature and winter-spring precipitation preconditioning, with different climate
forcings, especially land use, though this remains a work in progress.

4 Discriminating metrics

In this section we highlight metrics for which we have paleoclimate information that 
serve to discriminate between models that show different behaviours in future projec- 
tions (or more idealised scenarios). 

4.1 Rainfall change in South America 

Projections of precipitation change in South America have a large spread in the CMIP3 
archive (Meehl et al., 2007). In future projections, most models simulate a dipole of 
precipitation change in Northern South America, but the sign of this dipole depends on 
the model. If this feature is an intrinsic response in each model to a forcing, it might be 
possible to evaluate the dipole response in the paleo-climate simulations. 

We define the precipitation dipole as the annual-mean precipitation averaged over
0°-8° N, 60° W-50° W minus the annual-mean precipitation averaged over 5° S-15° S,
45° W-35° W. Over the 16 models examined here, the models that have the 5 lowest
values for rcp85-piControl precipitation dipole change are classified as group 1; those
with the 5 highest values are classified as group 3, and the remainder as group 2. The
group 1 models simulate drier Guyana, Venezuela and Colombia, and wetter Nordeste
and Eastern Brazil, associated with a Southward shift of the Inter-Tropical Convergence
Zone (ITCZ). The group 3 models simulate a wetter Venezuela and drier Eastern Brazil,
associated with a Northward shift of the ITCZ.
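
The grouping described above can be sketched as follows. This is an illustration under
stated assumptions: the function names are hypothetical, the box means are taken as
already computed, and the 16 values are placeholders rather than CMIP5 output.

```python
def dipole_index(north_box_mean, south_box_mean):
    """Precipitation dipole: northern-box mean minus southern-box mean."""
    return north_box_mean - south_box_mean


def classify_by_dipole_change(dipole_change, n_extreme=5):
    """Split models into group 1 (lowest), 2 (remainder) and 3 (highest).

    dipole_change maps a model name to its rcp85-minus-piControl change in
    the dipole index.
    """
    ranked = sorted(dipole_change, key=dipole_change.get)
    groups = {}
    for rank, name in enumerate(ranked):
        if rank < n_extreme:
            groups[name] = 1
        elif rank >= len(ranked) - n_extreme:
            groups[name] = 3
        else:
            groups[name] = 2
    return groups


# Placeholder dipole changes (mm/day) for 16 hypothetical models.
changes = {f"model_{k:02d}": 0.25 * k - 2.0 for k in range(16)}
groups = classify_by_dipole_change(changes)
```

With 16 models this yields 5 models in group 1, 6 in group 2 and 5 in group 3, matching
the split used in the text.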

Figure 4 shows a strong link between precipitation changes in the future and pre- 
cipitation changes in the MH. Models in group 1 show a dipole in the MH which is 
similar to the dipole they simulate in the future, with a strong Southward shift of the 
ITCZ. In contrast, models of group 3 show instead a broadening of the ITCZ in the 
MH. Therefore, paleo-proxies of precipitation along the South American coast could 
help determine which group of models is the most realistic in the MH, and, by extension,
which simulation of future change has greater credibility (Silva Dias et al.,
2009). The Prado et al. (2013) paleo-data synthesis for South America suggests drying
everywhere except over Northeastern Brazil, a pattern that is most consistent with that
simulated by the group 1 models.

To gain confidence in such a paleo-constraint, we need to understand the physical
processes that explain the common behavior between past and future. This preliminary
analysis will not fully answer this question, but it does illustrate how to make use
of the wealth of past, future and idealized CMIP5 simulations. Table 1 shows a selection
of correlations between precipitation changes and other model features. First,
in the future climate, shifts in the ITCZ seem to be associated with shifts in the SST
dipole in the Atlantic: models that shift the ITCZ the most southwards are those with
the strongest warming south of the Equator relative to the rest of the Atlantic. ITCZ
shifts in response to SST dipoles are expected (e.g. Kang et al., 2008). However, this
relationship does not seem to hold for the MH to PI change. Second, the atmospheric
component of the model also appears to play a key role. Some of the different model
behaviors can be seen in amipFuture simulations, where all models are forced by the
same pattern of SST warming. In addition, many of the different model behaviors can
already be seen in sstClim4xCO2 simulations, where a quadrupling of CO2 is imposed
with SST held constant. This is consistent with the fast response to CO2 being an
important component of the total precipitation response in global warming (e.g. Bala
et al., 2009). Models that decrease precipitation over Northern South America in the
projections and in the MH are those that decrease precipitation over this region under
4 × CO2. They also happen to be the models with the strongest land surface warming in
response to both 4 × CO2 and to MH forcing. Therefore, the different groups of models
show different precipitation responses to SST changes, orbital forcing and 4 × CO2,
but the response shows similarity between all these different forcings and within each
model group. This suggests that common mechanisms are involved in the precipitation
response to all forcings, and that this is representative of each individual model. Finally,
it is worth noting that models in group 3 often show the most significant “double ITCZ”
problem in the Atlantic, an obvious, and persistent, common model bias.

4.2 LGM constraints on climate sensitivity

The LGM has been a prime target for assessments of climate sensitivity since it is
a quasi-stable period with significant climate differences from today, with reasonably
well-known boundary conditions and sufficient data to reconstruct large-scale climate
shifts (e.g. Lorius et al., 1990; Edwards et al., 2007; Köhler et al., 2010; Schmittner
et al., 2011; PALAEOSENS, 2012).

We can apply the methods described in Sect. 2 to estimate the equilibrium climate
sensitivity based on the CMIP5 LGM simulations. We use an ensemble of opportunity
consisting of 7 models which participated in the PMIP2 experiment, together with
4 CMIP5 models for which sufficient data are available (at time of writing). Estimates of
the climate sensitivities of these models were obtained from a variety of sources and
were derived using a range of methods. For the PMIP2/CMIP3 models, sensitivity was
generally calculated using a slab ocean coupled to the atmospheric component (Meehl
et al., 2007), whereas in CMIP5, the most readily available estimates use a regression
based on a transient simulation (Andrews et al., 2012). These estimates are not perfectly
commensurate, with some models reporting a 10 % difference between the two methods
(Schmidt et al., 2013). Some of the PMIP2 models used for the LGM simulations
may also differ from the equivalent CMIP3 versions for which the sensitivity estimates
were made. Thus, the values used here may be somewhat inconsistent and imprecise,
although we expect the uncertainty arising from these sources to be modest in comparison
to the range of values represented across the ensemble. The boundary conditions
for the LGM simulations are essentially unchanged between PMIP2 and CMIP5 (save
for changes in the shape of the imposed ice sheets), allowing us to consider these experiments
as broadly equivalent (though there are some systematic biases, Kageyama
et al., 2012). Limitations in the boundary conditions (such as the exclusion of dust
and vegetation effects) may, however, introduce additional bias and uncertainty into
our result, which we do not attempt to account for here. For these and other reasons
discussed below, these results should be considered as a proof of concept rather than
conclusive.

The LGM was associated with a large negative radiative forcing anomaly with respect
to the pre-industrial, including substantially lower concentrations of greenhouse gases
(e.g. Köhler et al., 2010). However, the ensemble does not show an expected negative
correlation between climate sensitivities and their globally averaged LGM temperature
anomalies (over the full 100 yr of simulation output) (Fig. 5a, see also Crucifix, 2006).
There is a strong negative correlation in the tropics, most strongly in the latitude band
10° S-30° N (Fig. 5b) (Hargreaves et al., 2012a). The correlation is weaker at higher
latitudes where the feedbacks in response to large cryospheric changes may be very
different to those exhibited in a future warmer climate. There is also a strong positive
correlation in the southern ocean (i.e., colder LGM anomalies are linked with lower
sensitivity), possibly due to a large range of biases in the control climate (Fig. 5c).

The correlation of piControl temperatures to sensitivity points to the Arctic and the
southern oceans as regions where base climatology impacts sensitivity, probably via
cloud effects (see Trenberth and Fasullo, 2010, for a discussion). The strong negative
correlation (r = -0.8) between the LGM temperature anomalies in the latitude band
10° S-30° N and the climate sensitivities of the models (Fig. 6) is physically plausible,
since this region is far from the cryospheric and sea ice changes of the LGM, and the
forcing here is dominated by the reduction in greenhouse gas concentrations.

If we assume that the correlation with tropical temperatures provides a valid constraint
on the real climate system, we can use this correlation to project an observational
estimate of the past change onto the future, as in Boe et al. (2006). Recently, Annan
and Hargreaves (2012) generated a new estimate of LGM temperature changes,
based on a combination of several multiproxy data sets, and the ensemble of PMIP2
models. The method does not depend on the magnitude of changes estimated by the
models, but only their spatial patterns. Using the resulting estimate of LGM temperature
change in this latitude band of -2.2 ± 0.7 °C (at 90 % confidence), the predicted
value for climate sensitivity arising from the correlation is 2.7 °C, with a 90 % interval of
1.2-4.1 °C calculated by Monte Carlo sampling, but this range is somewhat sensitive
to the reconstruction uncertainties.
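
The regression-based projection can be sketched as below. This is a simplified
illustration, not the calculation behind the numbers quoted above: the ensemble values
are synthetic placeholders, the observational error is assumed Gaussian (so the 90 %
half-width of 0.7 °C is converted to a standard deviation by dividing by 1.645), and
uncertainty in the fitted regression coefficients is ignored.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sd = (sum(r * r for r in resid) / (len(x) - 2)) ** 0.5
    return a, b, sd


def constrained_estimate(x, y, obs_mean, obs_sd, n=20000, seed=0):
    """Monte Carlo projection of an uncertain observation through the fit.

    Each draw samples the observed quantity and adds regression scatter.
    Returns (mean, 5th percentile, 95th percentile) of the predicted value.
    """
    a, b, sd = fit_line(x, y)
    rng = random.Random(seed)
    draws = sorted(
        a + b * rng.gauss(obs_mean, obs_sd) + rng.gauss(0.0, sd)
        for _ in range(n)
    )
    return statistics.fmean(draws), draws[int(0.05 * n)], draws[int(0.95 * n)]


# Synthetic ensemble: tropical LGM anomaly (deg C) and climate sensitivity (deg C).
lgm_tropical = [-1.5, -1.9, -2.2, -2.6, -2.9, -3.3]
sensitivity = [2.1, 2.6, 2.7, 3.4, 3.3, 4.1]

mean, lo, hi = constrained_estimate(lgm_tropical, sensitivity, -2.2, 0.7 / 1.645)
```

Because both the observational uncertainty and the regression scatter enter each draw,
widening either one widens the resulting interval, which is why the range is sensitive
to the reconstruction uncertainties.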

For a more explicitly Bayesian approach, we initially assign equal probability to each
model in the ensemble. This choice can be questioned, given both the range of model
complexities, and also the possible inter or intra-generational similarities between models
of related origins (Masson and Knutti, 2010). However, quantifying these issues is
far from straightforward, so we make our choice for reasons of practicality and in order
to demonstrate the utility of the overall method. A standard kernel density estimation
based on the ensemble leads to the prior distribution presented as the green curve in
Fig. 7, which has a 90 % range of 1.7-4.9 °C and a mean of 3.4 °C. The observationally-derived
estimate of tropical temperature gives rise to the natural likelihood function
L(M|O) = P(O|M) from which the weights are calculated (where O represents the observations,
M the model simulation, L(M|O) the likelihood of the model result given the
observed data, and P(O|M) the probability of the observations assuming that the models
are correct). The posterior distribution is shown in red, the bulk of which has been
shifted to lower values with the mean reducing to 2.8 °C. Its 90 % probability range has
only moved slightly, however, to 1.6-4.7 °C. The reason for the upper limit here remaining
high is that the highest sensitivity model in the ensemble has been assigned a fairly
large weight since it matches the reconstructions well. The small size of the ensemble
means that this approach is rather sensitive to the presence or absence of particular
models in the ensemble.
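
The weighting step can be sketched in a few lines. This is an illustration only: it
assumes a Gaussian form for P(O|M), works with discrete model weights rather than the
kernel density estimate used for Fig. 7, and all names are hypothetical.

```python
import math


def gaussian_likelihood(model_value, obs_mean, obs_sd):
    """P(O|M) up to a constant, assuming Gaussian observational error."""
    z = (model_value - obs_mean) / obs_sd
    return math.exp(-0.5 * z * z)


def posterior_weights(model_values, obs_mean, obs_sd, prior=None):
    """Normalised weights proportional to prior * likelihood for each model.

    With prior=None every model starts with equal probability.
    """
    n = len(model_values)
    prior = prior if prior is not None else [1.0 / n] * n
    raw = [
        p * gaussian_likelihood(v, obs_mean, obs_sd)
        for p, v in zip(prior, model_values)
    ]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(values, weights):
    """Weighted mean of a quantity, e.g. the posterior mean sensitivity."""
    return sum(v * w for v, w in zip(values, weights))
```

Given each model's simulated tropical LGM anomaly and its climate sensitivity,
`weighted_mean(sensitivities, posterior_weights(anomalies, obs_mean, obs_sd))` gives a
posterior mean. A single well-matching model can dominate the weights in a small
ensemble, which is the sensitivity noted above.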

The two approaches differ considerably in their use of the model ensemble. In the
latter case, the ensemble is directly used as a prior estimate, which therefore imposes
quite a strong constraint on climate sensitivity even before these observational constraints
are used. The former method may be considered as roughly equivalent to using
a prior that is uniform in the observed variable (here tropical temperature), although
this approach is rarely presented in explicitly Bayesian terms. Despite the different assumptions
and approaches, these methods both generate rather similar estimates for
the climate sensitivity - both assigning highest probability towards the lower end of the
model range. The ranges are comparable with other paleo-climate derived estimates
of 2.3-4.8 °C (68 % confidence interval, PALAEOSENS, 2012) but, given the small ensemble
size and possible naivety of the assumptions made here, these estimates may
not be robust and need to be tested using a larger ensemble.

4.3 Arctic Sea ice sensitivity constraints from the mid-Holocene 

The rate and pattern of Arctic sea ice change in the future is of key scientific interest due 
both to the surprisingly rapid changes currently occurring and the large spread in model 
estimates in, for instance, the onset of summertime “ice-free” conditions (Stroeve et al., 
2012; Massonnet et al., 2012).

Recent studies (Mahlstein and Knutti, 2012; Abe et al., 2011) have demonstrated that
biases in sea ice volume have a strong impact on the simulated responses to radiative
perturbations, and that it may be possible to discriminate among models based
on interannual modes of variability. The mid-Holocene simulations (driven mainly by
changes in orbital forcing) may provide an orthogonal test of Arctic sea ice sensitivity.
MH insolation changes imply that NH summers were warmer than summers today
(see Kutzbach, 1981, and many subsequent papers). Paleo-data from the circum-Arctic
region indicate that this warmth was accompanied by reductions in sea ice extent at
least during some months of the year (Dyke and Savelle, 2001; de Vernal et al., 2005;
McKay et al., 2008; Funder et al., 2011; Polyak et al., 2010; Moros et al., 2006).

The CMIP5 MH simulations (Fig. 8) consistently show decreases in sea ice extent 
from August through to November. Changes in winter months are not coherent across 
the models, though these changes are not well characterised in the paleo-data either. 
There is a relationship (Fig. 9) between the size of the anomaly at the MH and in fu- 
ture projections, presumably reflecting the underlying sensitivity of the sea ice model 
and Arctic climate in general (see also O’ishi and Abe-Ouchi, 2011). This correlation 
exists despite the variations in the cause of the ice loss (summer insolation versus 
greenhouse-gas-related forcing). Although the small size of the ensemble raises questions
about the robustness of the relationships, it should be possible to use the MH ice extent
anomaly to estimate the likely loss in future projections. However, it may also be possible
to use more specific or local diagnostics to compare to a wider proxy network for
a similar constraint (Tremblay et al., 2013).

5 Exploratory metrics and limitations 

In this section we provide examples of where the paleo-climate information is ambiguous,
or where connections seen in paleo-climate changes do not translate into the
future for some reason. This may be related to forcing ambiguities, climate-change related
divergence, or potentially, a misunderstanding of the dominant processes. While
these examples are not directly informative about the future, they illustrate how the
limitations of our outlined approaches can be explored in ways that illuminate key uncertainties.

5.1 20th-century changes in tropical Pacific climate 

The response of the tropical Pacific Ocean to anthropogenic climate change is uncertain,
partly because we do not fully understand how the region has responded to
anthropogenic influences during the 20th century. Instrumentally based estimates of
SST do not depict an internally consistent view (Deser et al., 2010), and model simulations
similarly disagree regarding the 20th-century trend (Thompson et al., 2011).
Understanding trends in the tropical Pacific is particularly challenging because the
instrumental record is sparse even for the early 20th Century and long-term in situ
measurements of SST are uncommon. High-resolution paleoclimate records, particularly
the large network of tropical Pacific coral δ18O_calcite records, can be used in conjunction
with the observational record and help interpret tropical climate trends. These
proxy records respond to the combined effects of SST and the isotopic composition
of seawater (δ18O_sw) (which is strongly correlated to sea surface salinity, SSS) and
can reveal changes on longer time scales. To address the limitations of each individual
archive, Thompson et al. (2011) proposed using a forward-modeling approach to generate
synthetic coral records (i.e., pseudocorals) from observational and climate model
output and test whether these pseudocorals are in agreement with the network of coral
δ18O_c observations. If they agree with the δ18O_c records, then the causes of change
in the region may be inferred, while disagreement may reveal key uncertainties in the
data.

The coral δ18O model from Thompson et al. (2011) calculates isotopic variations as
a function of SST and SSS, with an SST-δ18O_c slope of -0.22 ‰ °C^-1 and the
SSS-δ18O_sw slope varying by region (LeGrande and Schmidt, 2006). When driven with
historical SST and SSS data, this simple model of δ18O_c was able to capture the spatial
and temporal pattern of ENSO and the linear trend observed in 23 Indo-Pacific coral
records between 1958 and 1990 (Thompson et al., 2011). The observed trends were
driven primarily by warming at the coral sites, though SSS trends were responsible for
approximately 40 % of the shared δ18O_c trend. These results not only indicate a significant
SSS trend in the tropical Pacific, but also the importance of δ18O_sw in simulating
the observed δ18O_c.
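
A bivariate forward model of this kind can be sketched as follows. The SST slope of
-0.22 ‰ per °C is the value quoted above; the SSS slope passed in the usage note is a
placeholder, since the regional values are not given here, and the function names are
hypothetical.

```python
def pseudocoral_anomaly(sst_anom, sss_anom, sss_slope, sst_slope=-0.22):
    """Pseudocoral d18O anomaly (permil) from SST and SSS anomalies.

    d18O_c' = sst_slope * SST' + sss_slope * SSS'
    sst_slope is in permil per deg C; sss_slope is in permil per salinity
    unit and must be supplied for the region of interest.
    """
    return [sst_slope * t + sss_slope * s for t, s in zip(sst_anom, sss_anom)]


def linear_trend(series):
    """Least-squares slope of a regularly sampled series, per time step."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den
```

For example, with a placeholder SSS slope of 0.27, a steady warming of 0.01 °C per step
combined with a freshening of 0.005 salinity units per step gives a negative δ18O_c
trend to which both terms contribute. Comparing `linear_trend` of the full pseudocoral
with that of the SST-only term separates the two contributions.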

However, pseudocoral records calculated from CMIP3 historical simulations did not 
reproduce the magnitude of the secular trend, the change in mean state, or the change 
in ENSO-related variance observed in the coral network from 1890 to 1990. Similarly 
large discrepancies are present between CMIP5 simulations and the observations, 
with none of the individual CMIP5 pseudocoral networks producing trends as strong 
as in the observed 20th Century coral records. While the observational coral network 
suggests a reduction in ENSO-related variance and an El Niño-like trend over the 20th
century, CMIP3 and CMIP5 simulations vary greatly on both points. 

The differences between observed and GCM-derived δ18O_c trends may stem from
the simplicity of the forward model for δ18O_c, bias in the coral records, and/or errors
in the GCM SST and SSS fields. In particular, the potential role of non-climatic trends
in δ18O_c and the magnitude and spatiotemporal pattern of the δ18O_sw-SSS relationship
need to be further investigated. Preliminary tests with data from isotope-enabled
coupled control simulations (LeGrande and Schmidt, 2011) support a minor role for
short term isotope variability, though it remains to be seen if longer-term trends in the
20th Century simulations are important. Previous work has highlighted biases
in simulated salinity fields as a potential source of the observed-simulated discrepancy
(Thompson et al., 2011, 2012). For example, CMIP3 and CMIP5 simulations display
weak and spatially heterogeneous SSS trends, such that the magnitude of the δ18O_c
trend in CMIP3 and CMIP5 simulated pseudocorals is indistinguishable from the trends
observed in individual centuries of an unforced control run (Fig. 10, upper panel). We
also find that the trends in mean state and change in ENSO-related variance within
the basin are highly variable among the CMIP5 models, and even between ensemble
members of the same model. On the other hand, while pseudocorals modeled from the
new SODA 20th-century reanalysis of SST and SSS display greater agreement with
the observed coral trends, two recent versions of this product disagree regarding the
relative contribution of SST and SSS. These results suggest that more work is needed
to constrain the magnitude of the observed 20th-century salinity trend throughout the
tropical Pacific Ocean.
Despite the disagreement among models and runs regarding the change over the
20th Century, the CMIP5 projections converge upon a more El Niño-like (e.g. warmer
eastern equatorial Pacific) mean state change by 2100 under RCP 4.5 (with only one
model suggesting the opposite), consistent with the CMIP3 projections (Meehl et al.,
2007). However, the models still disagree about the change in ENSO-related variance.
Further, there is no clear relationship between the magnitude of the simulated 20th-century
δ18O_c trend and the projected future δ18O_c trend in the CMIP5 ensemble
(Fig. 10, lower panel). This suggests that an agreement of the simulated 20th-century
change in the tropical Pacific with that of the observational coral network would not be
a reliable indicator of future trends.

5.2 Spectra and fluctuation analyses

As mentioned above, there is no restriction on what kind of variables, means, variances 
or higher-order statistics can be used in these analyses. In this section we highlight 
two analyses in the frequency domain that demonstrate the important role of relatively 
uncertain forcings in assessing skill.

In Fig. 11, we show the maximum-entropy method (MEM) spectra (using 30 poles)
for the NH mean land surface temperature over 8 last-millennium simulations with the
GISS-E2-R model that were run with different combinations of plausible solar, volcanic
and land use forcings (Schmidt et al., 2011, 2012). The spectra are similar for models
that have the same volcanic forcing, and significantly different when the volcanic
forcing is derived from a different dataset or where there is no volcanic forcing at all.
Specifically, interannual to multi-decadal variability is much larger when volcanoes are
imposed, and the larger the volcanic forcing, the greater the variability, with the largest
response in simulations using the Gao et al. (2008) reconstruction, compared to the
Crowley et al. (2008) reconstruction. In contrast, the difference between two different
solar forcings (Vieira et al., 2011; Steinhilber et al., 2009) is not detectable in this metric.
(Note that the implementation of the Gao et al., 2008, volcanic forcing in these
simulations was mis-specified and gave roughly twice the expected radiative forcing.
Although part of the increase in variance seen here was unanticipated, given the uncertainties
in specifying the forcing, the exercise is useful in highlighting the role of the
forcings in determining variance.)

Another analysis in the spectral domain is one focused on power law scaling (Lovejoy
and Schertzer, 1986). Several scaling studies of GCMs demonstrate that they generally
simulate the statistics (including spectral scaling exponents) reasonably well up to
≈ 10 yr scales (e.g. Fraedrich and Blender, 2003; Zhu et al., 2006; Rybski et al., 2008;
Lovejoy and Schertzer, 2012; Vyushin et al., 2012). This already gives us confidence in
the decadal scale responses of GCMs. However, tests at lower frequencies will depend
on solar and volcanic forcings as well as the possible impacts of slow processes such
as deep ocean or land-ice dynamics which are currently missing or perhaps poorly
represented in the models.

Following Lovejoy et al. (2012), we calculate the Root Mean Square (RMS) fluctuation
as a function of time-scale, from months to centuries, for the NH land temperatures
using the same eight runs of the GISS-E2-R model used above for the period 1500-1900 CE.
Since simulations are strongly clustered according to changes in the volcanic
forcing used (Fig. 11), for simplicity we averaged over the three GRE and three CEA
volcanic and the two no-volcanic runs.
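
An RMS fluctuation as a function of scale can be sketched as below. The text does not
specify the fluctuation definition, so this sketch assumes a Haar-type fluctuation (the
difference between the means of the second and first halves of each window), which is
one common choice in this literature; no calibration factor is applied and the function
name is hypothetical.

```python
def rms_haar_fluctuation(series, scale):
    """RMS Haar fluctuation of a regularly sampled series at a given scale.

    scale is the window length in samples and must be an even integer >= 2.
    For every window, the fluctuation is the mean of the second half minus
    the mean of the first half; the RMS is taken over all window positions.
    """
    if scale < 2 or scale % 2:
        raise ValueError("scale must be an even integer >= 2")
    half = scale // 2
    squares = []
    for start in range(len(series) - scale + 1):
        first = sum(series[start:start + half]) / half
        second = sum(series[start + half:start + scale]) / half
        squares.append((second - first) ** 2)
    if not squares:
        raise ValueError("series is shorter than the requested scale")
    return (sum(squares) / len(squares)) ** 0.5
```

Evaluating this over a range of scales shows whether fluctuations grow with scale (as
for a steady trend) or shrink with scale (as for rapidly alternating values), which is
the distinction drawn between the climate and low frequency weather regimes.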

For comparison, we show the mean of the same metric from three multiproxy reconstructions
(Huang, 2004; Moberg et al., 2005; Ljungqvist, 2010). The multiproxy average
is processed with and without the 20th Century to indicate the importance of that
period for the scaling behaviour - in all cases the variance in the multi-decadal to century
scale is greatly enhanced by the recent anthropogenic trend. These curves show
fluctuations decreasing with scale over the low frequency weather regime (months to
decades) but increasing in the climate regime (decades to centuries).

The comparison with the GISS-E2-R simulations is illuminating. First, we note that
at the decadal scale, the sign of all the slopes changes. However, the simulations
vary in the opposite direction from the data: first growing and then decreasing with
scale. Only the volcano-free runs (bottom) qualitatively follow the reconstructions by
first decreasing and then increasing with scale. When compared to the surface data
and multiproxy reconstructions, we see that at ~ 10 yr the simulations have variance
that is too large, while at longer scales (> 100 yr) the variance is too small.

These results demonstrate clear mismatches in behaviour between the models’ simulated
variance at different scales and the inferred variability from multi-proxy reconstructions.
However, there are strong sensitivities to the (uncertain) external forcing
functions, precluding a straightforward attribution of the mismatch to potentially mis-specified
forcings, missing mechanisms, insufficient “slow” variability or data problems
in the reconstructions.

5.3 Hydroclimate divergence

Distinct from temperature, hydroclimate variability can be quantified using a range of
variables, including precipitation, soil moisture, lake levels, or other synthetic indices
(e.g. Nigam and Ruiz-Barradas, 2006). Most models provide output for these diagnostics,
but often these variables are not available directly from paleo-climate archives,
creating a challenge when conducting model-data comparisons. However, calibrations
of networks of precipitation sensitive tree ring widths have been used to reconstruct the
Palmer Drought Severity Index (PDSI) in North America and Asia over the Common Era
(Cook et al., 2004, 2010). PDSI is calculated using temperature-derived estimates of the
evapo-transpiration and precipitation, and nominally represents a normalized index of
soil moisture, with negative values indicating drought and positive values indicating wetter
than normal conditions. There are many outstanding issues with using variations of
the index globally to assess drought, in definition and availability and quality of inputs
and sensitivity (e.g. contrast Sheffield et al., 2012; Dai, 2012). However, we focus here
on the question of how well this index, if derived from GCM output, reflects actual
model soil moisture and whether this relationship changes over time.

From two GCMs (GISS-E2-R and MIROC-ESM), we calculated PDSI using model
temperature and precipitation (the Thornthwaite method) and compared this index
against the standardized (zero mean, unit standard deviation, over the 1850-1950 period,
10 yr smoothing) total column soil moisture model output for the Central Plains of
North America (105° W-90° W; 32° N-48° N) (Fig. 13). Prior to the start of the industrial
period in 1850, PDSI and soil moisture track each other closely in both models
(GISS-E2-R: r = 0.82; MIROC-ESM: r = 0.50). Beginning near the middle of the twentieth
century, however, the two indices begin to diverge dramatically. In one model
(GISS-E2-R) the correlation weakens considerably (r = 0.33), while in the other model
(MIROC-ESM) the sign of the correlation actually reverses (r = -0.29).
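
The standardization and period-by-period correlation described above can be sketched as
follows. This is an illustration with hypothetical function names; it assumes annual
series with matching year labels and uses a simple boxcar for the 10 yr smoothing, which
may differ from the filter actually applied.

```python
import statistics


def standardize(series, years, ref_start=1850, ref_end=1950):
    """Z-score a series using the mean and sd of a reference period."""
    ref = [v for v, yr in zip(series, years) if ref_start <= yr <= ref_end]
    mu, sd = statistics.fmean(ref), statistics.pstdev(ref)
    return [(v - mu) / sd for v in series]


def running_mean(series, window=10):
    """Boxcar smoothing; returns len(series) - window + 1 values."""
    return [
        statistics.fmean(series[i:i + window])
        for i in range(len(series) - window + 1)
    ]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_in_window(x, y, years, start, end):
    """Correlation between two series restricted to start <= year <= end."""
    pairs = [(a, b) for a, b, yr in zip(x, y, years) if start <= yr <= end]
    return pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
```

Calling `correlation_in_window` separately for the pre-industrial and recent periods on
PDSI and standardized soil moisture gives the kind of before-and-after comparison
reported in the text.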

PDSI changes over the twenty-first century indicate severe and unprecedented
drought, in contrast to the model soil moisture trends, which indicate a modest shift
towards drying (GISS-E2-R) or even wetter conditions over the coming decades
(MIROC-ESM). The reason for this divergence is in the treatment of evapotranspiration
(ET) in the model soil moisture versus in the PDSI (Thornthwaite) calculation. In
this PDSI calculation, temperature is used as a proxy for the energy available, while in
the GCMs the soil energy and moisture budgets are calculated directly using explicit
physical models. In reality, ET becomes increasingly decoupled from temperature
as the temperature increases, a factor reflected in the model soil moisture but
not in the PDSI index. For time periods with strong transient forcing in temperature
(e.g., the late twentieth century and into the future), our analysis suggests that the
usefulness of PDSI for diagnosing drought and hydroclimate trends is limited. This
suggests caution should be used when trying to convert projected variables to those
defined from the paleoclimate record.

6 Conclusions and recommendations 

In this paper, we have focused on the opportunities provided by inclusion of “out-of-sample”
paleo-climate experiments within the CMIP5 framework, and specifically how
measures of skill in modelling paleo-climate change might inform future projections of
climate change.

We have shown that some relationships are robust across the ensemble of models,
simulations and paleo-data (Sect. 3) and furthermore that there are skill measures that
are well correlated to the simulated magnitude of future change, thus allowing the likely
magnitude of future changes to be constrained (Sect. 4). However, there is a need for
caution because of the limitations with models, the experimental setup used in CMIP5,
or with the paleo-climate data itself (Sect. 5).

Our examples suggest that there are some general requirements for attempts to
use the paleo-climate simulations to quantitatively constrain future projections. Each
example makes use of a specific target (or targets) from a paleo-climate reconstruction
of change, defines a metric of skill that quantifies the accuracy of the modeled changes
and assesses the connection to a future prediction. We recommend that ideally:

- paleo-data targets be spatially representative synthesis products with well-characterised
uncertainties,

- the chosen metrics should be robust to uncertainties in external forcing,

- the chosen metrics should not be overly sensitive to the model representation of
key phenomena, and should be within the scope of the modelled system,

- any relationship between the targets in the past and the future predictions should
be examined, and not simply assumed.

Under these conditions, the likelihood of a significant constraint is much greater. 

We underline the need for paleo-simulations to be performed with models that are
also being used for future projections, and for model diagnostics to be commensurate
(see also Schmidt, 2012). Although the robustness of some of our analyses is limited by
the small number of paleo-simulations currently available in the CMIP5 database, we 
hope that the demonstration of their potential to address questions relevant to the future 
should encourage other modeling groups to complete and archive these simulations. 

There are also important lessons here for the paleo-data community. Our analyses 
rely heavily on the use of synthesis data products, for instance the MARGO dataset for 
the LGM (MARGO, 2009), pollen-based reconstructions for the Mid-Holocene (Bartlein 
et al., 2011), multi-proxy reconstructions of hemispheric temperature (e.g. Moberg 
et al., 2005), or gridded tree-ring based reconstructions of PDSI for the last
millennium (Cook et al., 2010). Such products are invaluable, but there is a need for
greater transparency about the uncertainties they include and for continued expansion
(e.g. see Muller et al., 2011, for sea ice extent). Increasing model complexity, for instance by
including a carbon cycle, fire models or online tracers such as water isotopes, necessi- 
tates the creation of new syntheses (e.g. charcoal records: Daniau et al., 2012; or sea 
surface carbonate isotopes: Oppo et al., 2007). 

CPD 

9, 775-835, 2013 


Using paleo-climate 
comparisons to 
constrain future 
projections in CMIP5 

G. A. Schmidt et al. 


The periods and experiments chosen for paleo-climate modelling are far more limited
than the number of interesting features in the paleo-climate record. The three periods
selected for CMIP5 were chosen on the basis of their relative maturity (the existence
of prior sets of experiments, already tested issues, existing data syntheses), but
additional periods are also potentially useful: the mid-Pliocene (2.5 million yr ago), the
8.2 kyr event, the last interglacial, the peak Eocene, etc. (see Schmidt, 2012, for
justifications). Some of these periods are already being examined in a coordinated fashion
(e.g. Haywood et al., 2012, and Dolan et al., 2012, for the Pliocene), and it is to be
hoped that more will be started. Further expansion of the model experiments will
increasingly produce higher frequency diagnostics (daily and sub-daily variations), and
perturbed physics ensembles, to better characterise the model structural uncertainty.
Both of these expansions will create possibilities for more, and better, tests of model
performance. In the meantime, there is already huge scope for more informative
comparisons using the existing databases.
Table 1. List of models, institutions and experiments used in the analyses in this paper.
Experiment names use the CMIP5 database shorthand, and run numbers are the “rip” coding for
each experiment.

Model Name | Model Institution | Experiments | Run numbers
ACCESS-1.0 | CSIRO (Commonwealth Scientific and Industrial Research Organisation, Australia), and BOM (Bureau of Meteorology, Australia) | historical | r1i1p1
BCC-CSM1 | Beijing Climate Center, China Meteorological Administration, China | piControl | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp85 | r1i1p1
CanESM2 | Canadian Centre for Climate Modelling and Analysis, Canada | historical | r[1-5]i1p1
 |  | rcp45 | r[1-5]i1p1
CNRM-CM5 | Centre National de Recherches Meteorologiques / Centre Europeen de Recherche et Formation Avancee en Calcul Scientifique, France | piControl | r1i1p1
 |  | historical | r[1-10]i1p1
 |  | midHolocene | r1i1p1
 |  | lgm | r1i1p1
 |  | 1pctCO2 | r1i1p1
 |  | abrupt4xCO2 | r1i1p1
 |  | rcp45 | r1i1p1
 |  | rcp85 | r1i1p1
CSIRO-Mk3-6-0 | Commonwealth Scientific and Industrial Research Organisation in collaboration with the Queensland Climate Change Centre of Excellence, Australia | piControl | r1i1p1
 |  | historical | r[1-10]i1p1
 |  | midHolocene | r1i1p1
 |  | rcp45 | r[1-10]i1p1
 |  | rcp85 | r1i1p1
EC-EARTH | EC-Earth consortium | piControl | r1i1p1
 |  | historical | r7i1p1
 |  | midHolocene | r1i1p1
 |  | rcp45 | r[1,2,6-9,11,12,14]i1p1
 |  | rcp85 | r1i1p1
FGOALS-g2 | LASG, Institute of Atmospheric Physics, Chinese Academy of Sciences; and CESS, Tsinghua University, China | piControl | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp85 | r1i1p1
GFDL-CM2.1 | NOAA Geophysical Fluid Dynamics Laboratory, US | historical | r1i1p1
GFDL-ESM2G | NOAA Geophysical Fluid Dynamics Laboratory, US | piControl | r1i1p1
 |  | historical | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp85 | r1i1p1
GFDL-ESM2M | NOAA Geophysical Fluid Dynamics Laboratory, US | piControl | r1i1p1
 |  | historical | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp85 | r1i1p1
GISS-E2-H | NASA Goddard Institute for Space Studies, US | piControl | r1i1p1
 |  | historical | r[1-5]i1p[12]
GISS-E2-R | NASA Goddard Institute for Space Studies, US | piControl | r1i1p1, r1i1p141
 |  | historical | r1i1p[12], r[45]i1p3
 |  | past1000 | r1i1p12[1-8]
 |  | midHolocene | r1i1p1
 |  | lgm | r1i1p15[01]
 |  | 1pctCO2 | r1i1p1
 |  | abrupt4xCO2 | r1i1p1
 |  | rcp45 | r[1-5]i1p1
 |  | rcp85 | r1i1p1
HadCM3 | Hadley Center, UK Met. Office, UK | historical | r[1-10]i1p1
HadGEM2-CC | Hadley Center, UK Met. Office, UK | piControl | r1i1p1
 |  | historical | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp45 | r1i1p1
 |  | rcp85 | r1i1p1
HadGEM2-ES | Hadley Center, UK Met. Office, UK | piControl | r1i1p1
 |  | historical | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp45 | r[1-3]i1p1
 |  | rcp85 | r1i1p1
INM-CM4 | Institute for Numerical Mathematics, Russia | piControl | r1i1p1
 |  | historical | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp45 | r1i1p1
 |  | rcp85 | r1i1p1
IPSL-CM5A-LR | Institut Pierre-Simon Laplace, France | piControl | r1i1p1
 |  | historical | r[1-4]i1p1
 |  | midHolocene | r1i1p1
 |  | lgm | r1i1p1
 |  | 1pctCO2 | r1i1p1
 |  | abrupt4xCO2 | r1i1p1
 |  | rcp45 | r[1-4]i1p1
 |  | rcp85 | r1i1p1
IPSL-CM5A-MR | Institut Pierre-Simon Laplace, France | piControl | r1i1p1
 |  | historical | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp45 | r1i1p1
 |  | rcp85 | r1i1p1
MIROC-ESM | Japan Agency for Marine-Earth Science and Technology, Atmosphere and Ocean Research Institute (The University of Tokyo), and National Institute for Environmental Studies, Japan | piControl | r1i1p1
 |  | midHolocene | r1i1p1
 |  | lgm | r1i1p1
 |  | past1000 | r1i1p1
 |  | 1pctCO2 | r1i1p1
 |  | abrupt4xCO2 | r1i1p1
 |  | rcp85 | r1i1p1
MIROC5 | Atmosphere and Ocean Research Institute (The University of Tokyo), National Institute for Environmental Studies, and Japan Agency for Marine-Earth Science and Technology, Japan | piControl | r1i1p1
 |  | historical | r1i1p1
 |  | midHolocene | r1i1p1
 |  | rcp45 | r[1-3]i1p1
 |  | rcp85 | r1i1p1
MPI-ESM-P | Max Planck Institute for Meteorology, Hamburg, Germany | piControl | r1i1p1
 |  | historical | r1i1p1
 |  | past1000 | r1i1p1
 |  | lgm | r1i1p1
 |  | midHolocene | r1i1p1
 |  | 1pctCO2 | r1i1p1
 |  | abrupt4xCO2 | r1i1p1
 |  | rcp85 | r1i1p1
MPI-ESM-LR | Max Planck Institute for Meteorology, Hamburg, Germany | piControl | r1i1p1
 |  | historical | r[1-3]i1p1
 |  | rcp85 | r1i1p1
MRI-CGCM3 | Meteorological Research Institute, Tsukuba, Japan | piControl | r1i1p1
 |  | midHolocene | r1i1p1
 |  | lgm | r1i1p1
 |  | 1pctCO2 | r1i1p1
 |  | abrupt4xCO2 | r1i1p1
 |  | rcp85 | r1i1p1
NCAR-CCSM4 | National Center for Atmospheric Research, US | piControl | r1i1p1
 |  | midHolocene | r1i1p1
 |  | lgm | r1i1p1
 |  | 1pctCO2 | r1i1p1
 |  | abrupt4xCO2 | r1i1p1
 |  | rcp45 | r1i1p1
 |  | rcp85 | r1i1p1
NCAR-CESM1 | National Center for Atmospheric Research, US | historical | r[5-7]i1p1
NorESM1-M | Norwegian Climate Centre, Norway | piControl | r1i1p1
 |  | historical | r[1-3]i1p1
 |  | midHolocene | r1i1p1
 |  | rcp45 | r1i1p1
 |  | rcp85 | r1i1p1
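The run numbers in Table 1 use a bracket shorthand (for example r[1-5]i1p1) for groups of CMIP5 “rip” identifiers, where r, i and p denote the realization, initialization method and physics version. The following is a minimal sketch of how such shorthand could be expanded into individual run identifiers. The function names `expand_bracket` and `expand_rip` are hypothetical, and the rule used to read the brackets (a comma or hyphen means a list of numbers or ranges, otherwise each digit is a separate option) is our assumption about the notation, not something stated in the paper.

```python
import itertools
import re


def expand_bracket(body):
    """Expand the text inside one [...] group into a list of strings.

    Assumed convention: '1-5' or '1,2,6-9' is a numeric list with ranges,
    while a bare digit string such as '12' means the options '1' and '2'.
    """
    if "," in body or "-" in body:
        values = []
        for part in body.split(","):
            part = part.strip()
            if "-" in part:
                lo, hi = part.split("-")
                values.extend(str(v) for v in range(int(lo), int(hi) + 1))
            else:
                values.append(part)
        return values
    return list(body)


def expand_rip(shorthand):
    """Return every run identifier covered by a bracketed rip shorthand."""
    # re.split with a capturing group alternates literal text (even indices)
    # and bracket contents (odd indices).
    pieces = re.split(r"\[([^\]]*)\]", shorthand)
    options = [
        [piece] if index % 2 == 0 else expand_bracket(piece)
        for index, piece in enumerate(pieces)
    ]
    return ["".join(combo) for combo in itertools.product(*options)]


for entry in ["r1i1p1", "r[1-5]i1p[12]", "r[1,2,6-9,11,12,14]i1p1"]:
    print(entry, "->", expand_rip(entry))
```

Under this reading, an entry such as r[1-5]i1p[12] covers ten runs (five realizations for each of two physics versions).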
Table 2. Correlation of different variables with future precipitation change in the RCP 8.5
scenario. Precipitation changes are defined as in Fig. 4: annual-mean precipitation averaged
over 0°-8° N, 60° W-50° W minus annual-mean precipitation averaged over 5° S-15° S, 45° W-
35° W; SST dipole changes are defined as the annual-mean change in SST averaged over
3° N-15° N, 50° W-20° W minus the annual-mean change in SST averaged over 3° S-15° S,
30° W-20° W (see boxes on Fig. 4b). Land surface warming is the annual-mean warming
averaged over 15° S-0°, 70° W-50° W. The double ITCZ index is defined as the annual-mean
precipitation averaged over the Southern branch (7° S-3° S, 35° W-20° W) minus the annual-mean
precipitation averaged over the Northern branch (7° N-3° N, 35° W-20° W).

Variable | correlation (r) | No. of models | p-value
midHolocene-piControl: Δ precip | 0.93 | 9 | 0.0001
rcp85-piControl: Δ SST dipole | 0.67 | 16 | 0.002
midHolocene-piControl: Δ SST dipole | -0.08 | 9 | Not significant
amipFuture-amip: Δ precip | 0.78 | 5 | 0.06
sstClim4xCO2-sstClim: Δ precip | 0.92 | 7 | 0.002
sstClim4xCO2-sstClim: Δ SAT (land) | 0.71 | 8 | 0.024
midHolocene-piControl: Δ SAT (land) | 0.73 | 9 | 0.013
double ITCZ index in piControl | -0.66 | 16 | 0.003
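Table 2 does not state how its p-values were obtained. As an illustration only, the sketch below computes the significance of a correlation coefficient r from n models using a one-sided Student's t-test with n - 2 degrees of freedom; the choice of test is our assumption, although it gives values close to those listed (for instance for the r = 0.67, n = 16 row). The helper names `t_pdf` and `one_sided_p` are hypothetical, and the tail probability is obtained by simple numerical integration so that only the standard library is needed.

```python
import math


def t_pdf(t, dof):
    """Probability density of Student's t distribution with dof degrees of freedom."""
    coef = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coef * (1 + t * t / dof) ** (-(dof + 1) / 2)


def one_sided_p(r, n, steps=2000):
    """One-sided p-value for a correlation r across n models (assumed test)."""
    dof = n - 2
    t = abs(r) * math.sqrt(dof) / math.sqrt(1 - r * r)
    # Simpson's rule for the integral of the density from 0 to t
    # (steps must be even).
    h = t / steps
    total = t_pdf(0.0, dof) + t_pdf(t, dof)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, dof)
    area = total * h / 3
    # The density is symmetric, so the upper tail is 0.5 minus that area.
    return 0.5 - area


# (r, n) pairs taken from Table 2.
for r, n in [(0.93, 9), (0.67, 16), (0.78, 5), (0.71, 8), (-0.66, 16)]:
    print(f"r = {r:+.2f}, n = {n:2d}: one-sided p = {one_sided_p(r, n):.4f}")
```

With so few models per row, these p-values are sensitive to the inclusion or removal of a single model, which is one reason the relationships should be examined rather than assumed.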
Fig. 1. Average regional temperature changes vs. global temperature changes for the glacial (in
blue, the pre-industrial minus LGM difference is shown), years 120 to 140 of the 1pctCO2
simulations (in yellow, in comparison to piControl) and the years 100 to 150 of the abrupt4xCO2
simulations (in red, in comparison to piControl). For the bottom plots, the model averages are
taken only from grid-boxes that correspond to proxy data sites within the defined region
(reconstruction range shown in blue shading). Definition of the regions: Tropics: 23° S-23° N;
North Atlantic Europe: 45° W-45° E, 30-50° N; East Antarctica: 5° W-165° E, 70-80° S. The
results have been computed for all models in the database on 23 July 2012 for which there were
results for the lgm, piControl, 1pctCO2 and abrupt4xCO2 simulations. Reconstructions are from
MARGO (2009) and Bartlein et al. (2011).
[Figure legend: CNRM-CM5, MPI-ESM-P, MIROC-ESM, GISS-E2-R2, IPSL-CM5A-LR, MRI-CGCM3,
GISS-E2-R1 and NCAR-CCSM4, each shown for the lgm, 1pctCO2 and abrupt4xCO2 experiments,
plus reconstructions.]

Fig. 2. Average surface air temperature change, compared to piControl, over land versus
over the oceans for the North Atlantic and Europe region (45° W-45° E, 30°-50° N) and the
tropics (23° S-23° N). LGM - piControl in blue, 1pctCO2 - piControl in orange, abrupt4xCO2
- piControl in red. For the latter two, the averages have been computed over the same
years as in Fig. 1 above. The results have been computed for all models in the database on 23
July 2012 for which there were results for the lgm, piControl, 1pctCO2 and abrupt4xCO2. The
grey lines indicate the 1 : 1.5 ratio in both plots. Reconstructions are as in Fig. 1.
Fig. 3. Illustration of quantile regressions between the percentage of summer hot days in
Europe (i.e. exceeding the 90th quantile of daily mean temperature in June-July-August) and the
precipitation frequency anomaly with respect to the mean in winter-spring (January to May). The
precipitation frequency is computed over southern Europe (36° N to 46° N) and is defined as
the percentage of days with precipitation exceeding 0.5 mm. The quantile regressions are
computed for the 10th and 90th quantiles of the hot day frequency, following Quesada et al. (2012).
(a) shows the quantile regression for Western Europe from the EOBS dataset (Haylock et al.,
2008) between 1950 and 2011, where each point represents a year; (b) is for the “historical”
simulation (1960-2008) of the IPSL-CM5A-MR model (Dufresne et al., 2013). Both panels
show a widening of the quantile regression for low values of precipitation frequency, indicating
a consistency of the model simulation with observations.
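To make the quantile regression in Fig. 3 concrete, the sketch below fits straight lines for the 10th and 90th quantiles by minimising the pinball (check) loss. It is an illustration only: the data are synthetic, the names `pinball_loss` and `quantile_line` are hypothetical, and the brute-force search over lines through pairs of points (which contains an exact minimiser for a two-parameter fit) is chosen for transparency rather than being the method of Quesada et al. (2012).

```python
import random


def pinball_loss(residual, q):
    """Check loss for quantile q: under-predictions cost q, over-predictions 1 - q."""
    return q * residual if residual >= 0 else (q - 1) * residual


def quantile_line(x, y, q):
    """Return (intercept, slope) minimising the summed pinball loss.

    A minimiser of this loss can always be found among the lines through two
    data points, so all such lines are tried in turn (fine for small samples).
    """
    best = None
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if x[i] == x[j]:
                continue
            slope = (y[j] - y[i]) / (x[j] - x[i])
            intercept = y[i] - slope * x[i]
            loss = sum(
                pinball_loss(yk - (intercept + slope * xk), q)
                for xk, yk in zip(x, y)
            )
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]


# Synthetic example (not real data): one point per year, with the spread of
# hot-day percentages widening when the precipitation frequency anomaly is low.
rng = random.Random(0)
precip_anomaly = [rng.uniform(-10.0, 10.0) for _ in range(62)]
hot_days = [5.0 + rng.random() * (12.0 - x) for x in precip_anomaly]

intercept_10, slope_10 = quantile_line(precip_anomaly, hot_days, 0.10)
intercept_90, slope_90 = quantile_line(precip_anomaly, hot_days, 0.90)
print("10th quantile slope:", round(slope_10, 2))
print("90th quantile slope:", round(slope_90, 2))
```

A steeper (more negative) slope for the 90th quantile than for the 10th corresponds to the widening of the regression at low precipitation frequency described in the caption.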
Fig. 4. (a) Relationship between the precipitation dipole change from pre-industrial to future
climate under RCP 8.5 for 2080-2100 and the precipitation dipole change from pre-industrial
to mid-Holocene. Only those models within each group that had both rcp85 and midHolocene
data available at the time are plotted; other models that provided only rcp85 data are listed for
completeness. (b) Maps of precipitation changes from piControl to rcp85 (top) and from
piControl to midHolocene (bottom), averaged over all available models in group 1 (left) and in
group 3 (right). Contours show corresponding SST changes. The boxes over land and ocean show
the areas used in the dipole definitions.
Fig. 5. (a) Global mean LGM temperature change versus overall climate sensitivity to 2xCO2.
(b) Correlation between local air temperature anomaly and climate sensitivity across the model
ensemble. (c) Correlation across the model ensemble between control run temperatures and
climate sensitivity.
Fig. 6. Using LGM Tropical temperature as a constraint on climate sensitivity. Cyan and blue 
dots represent PMIP2 and CMIP5 simulations. Linear correlation and predictive uncertainty 
range are plotted as solid and dashed blue lines respectively. Small red dots represent a Monte 
Carlo sample from the estimated proxy-derived observational value, mapped onto the climate 
sensitivity. 
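The procedure summarised in the Fig. 6 caption (a linear relationship across the ensemble, then a Monte Carlo sample of the proxy-derived value mapped onto climate sensitivity) can be sketched as follows. All numbers here are hypothetical placeholders, not PMIP2, CMIP5 or proxy values, and the predictive spread uses only the residual standard deviation of the fit, ignoring uncertainty in the fitted slope and intercept; the function name `fit_line` is also our own.

```python
import random
import statistics

# Hypothetical ensemble: LGM tropical temperature change (K) and climate
# sensitivity (K) for seven imaginary models. Illustrative values only.
lgm_tropical_dT = [-1.6, -2.0, -2.3, -2.5, -2.8, -3.1, -3.4]
sensitivity = [2.3, 2.6, 3.1, 3.0, 3.6, 3.9, 4.4]


def fit_line(x, y):
    """Ordinary least squares fit; returns intercept, slope and residual sd."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    resid_sd = (sum(r * r for r in residuals) / (len(x) - 2)) ** 0.5
    return intercept, slope, resid_sd


intercept, slope, resid_sd = fit_line(lgm_tropical_dT, sensitivity)

# Hypothetical proxy-derived estimate of the LGM tropical change (mean, sd).
OBS_MEAN, OBS_SD = -2.2, 0.4

# Sample the observational estimate, map each draw through the regression,
# and add scatter about the regression line.
rng = random.Random(42)
samples = sorted(
    intercept + slope * rng.gauss(OBS_MEAN, OBS_SD) + rng.gauss(0.0, resid_sd)
    for _ in range(10000)
)
low, median, high = samples[500], samples[5000], samples[9500]
print(f"slope = {slope:.2f} K per K")
print(f"5-95 % range: {low:.2f} to {high:.2f} K (median {median:.2f} K)")
```

The width of the resulting range reflects both the observational uncertainty and the scatter of the models about the regression line, so a weak correlation yields a weak constraint even if the proxy estimate is precise.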
Fig. 7. Climate sensitivity estimated through weighting of the PMIP models. Cyan and blue
dots represent PMIP2 and CMIP5 simulations. Green curve shows prior distribution of climate
sensitivity (based on equal weighting of the models). Red curve shows posterior distribution,
after weighting according to match to the LGM tropical temperature. Vertical bars indicate
5-95 % ranges.
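The weighting idea in Fig. 7 can be illustrated with a short sketch in which each model starts with equal (prior) weight and is then re-weighted by how well its LGM tropical temperature change matches a proxy-derived value. The Gaussian form of the weights, the helper names and every number below are our assumptions for illustration; the caption does not specify the likelihood used, and the values are not PMIP2, CMIP5 or proxy data.

```python
import math

# Hypothetical ensemble (illustrative values only).
lgm_tropical_dT = [-1.6, -2.0, -2.3, -2.5, -2.8, -3.1, -3.4]  # K
sensitivity = [2.3, 2.6, 3.1, 3.0, 3.6, 3.9, 4.4]  # K

# Hypothetical proxy-derived LGM tropical change and its uncertainty.
OBS_DT, OBS_SD = -2.2, 0.5


def gaussian_weights(values, target, sd):
    """Normalised weights proportional to a Gaussian likelihood of the mismatch."""
    raw = [math.exp(-0.5 * ((v - target) / sd) ** 2) for v in values]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mean(values, weights):
    """Mean of values under weights that sum to one."""
    return sum(v * w for v, w in zip(values, weights))


prior = [1.0 / len(sensitivity)] * len(sensitivity)
posterior = gaussian_weights(lgm_tropical_dT, OBS_DT, OBS_SD)

prior_mean = weighted_mean(sensitivity, prior)
posterior_mean = weighted_mean(sensitivity, posterior)
print(f"prior mean sensitivity:     {prior_mean:.2f} K")
print(f"posterior mean sensitivity: {posterior_mean:.2f} K")
```

Unlike a regression-based constraint, this approach cannot produce a posterior outside the range spanned by the models, so its result depends on how well the ensemble samples the plausible sensitivities.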
Fig. 8. Sea-ice extent in CMIP5 models in 10^6 km^2. (a) 30-yr mean seasonal cycle for the
historical period (1870-1900), (b) the anomaly in sea ice extent for the period 2036-2065 in
RCP 8.5, and (c) the anomaly at the mid-Holocene.
Fig. 9. Relationship between the September MH anomaly and the September RCP 8.5 anomaly
across the CMIP5 models.
Fig. 10. Upper panel: Magnitude of the trend in δ18Oc (‰/decade, computed from a simple
linear regression through the trend PC) in corals (far left), Simple Ocean Data Assimilation
(SODA) 20th-century reanalysis (Carton and Giese, 2008; Giese and Ray, 2011; Compo et al.,
2011), a 500-yr control run from GFDL CM2.1 (Wittenberg, 2009), and the CMIP3 and CMIP5
multi-model ensembles. In each case, δ18Oc was modeled from SST and SSS (1), SST only (2),
and SSS only (3). Lower panel: Magnitude of the δ18Oc trend (‰/decade, computed from a
simple linear regression through the trend PC) over 1890-1990 in pseudocorals modeled from
CMIP5 historical simulations and over 2006-2100 in the RCP 4.5 projections, where numbers
in parentheses indicate the number of runs in the historical and RCP 4.5 ensembles, respectively.
[Figure title: Spectra of NH Land Surface Temperatures for the Last Millennium]

Fig. 11. Spectra from an ensemble of LM simulations using the same model but driven with
different sets of forcings compared with Ljungqvist (2010), Mann et al. (2008) and Moberg
et al. (2006) reconstructions. The clustering of simulations is driven entirely by changes in the
volcanic forcing dataset used, with the simulations with the most decadal and multi-decadal
variability using the Gao et al. (2008) reconstruction. Only in the examples where no volcanic
forcing is used at all is the impact of different solar forcing reconstructions detectable. Spectra
derived using MEM with 30 poles, from 850 to 2005, after correction for control run drift using
a loess low-frequency estimate derived from the control run. Key abbreviations: Land use: Pnz
(Pongratz et al., 2008), Kap (Kaplan et al., 2011); Solar: Krv (Vieira et al., 2011), Stn (Steinhilber
et al., 2009); Volcanic: Gao (Gao et al., 2008), Crw (Crowley et al., 2008).
Fig. 12. RMS fluctuations of instrumental and paleoclimate reconstructions compared to
simulations of the Northern Hemisphere land temperature for the period 1500-1900. GRE, CEA
refer to GISS-E2-R simulations using the Gao et al. (2008) and Crowley et al. (2008)
reconstructions of volcanic forcing. The multiproxy reconstruction used is an average of three
NH estimates, and the RMS fluctuations are separately shown for the periods 1000-1900 and
1000-1980.
Fig. 13. Standardized anomalies for PDSI and soil moisture in two models (GISS-E2-R and
MIROC-ESM) using a past1000 simulation, and a historical+rcp85 continuation. For reference,
the tree-ring based reconstruction is plotted (dashed line) (Cook et al., 2010), though this would
not be expected to line up exactly with the model simulations. All data smoothed with a 10-yr
running mean.
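The two processing steps named in the Fig. 13 caption, standardization and a 10-yr running mean, can be sketched as below. This is a minimal illustration under stated assumptions: the series is taken to be annual (so a 10-yr window is 10 values), anomalies are standardized against the mean and standard deviation of the whole series (the caption does not give the reference period), and the helper names and example series are hypothetical.

```python
import math
import statistics


def standardize(series):
    """Convert a series to anomalies in units of its own standard deviation."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [(value - mean) / sd for value in series]


def running_mean(series, window=10):
    """Running mean over complete windows only (output is window - 1 values shorter)."""
    total = sum(series[:window])
    smoothed = [total / window]
    for i in range(window, len(series)):
        # Slide the window by one step: add the new value, drop the oldest.
        total += series[i] - series[i - window]
        smoothed.append(total / window)
    return smoothed


# Hypothetical annual drought-index values (not model output or tree-ring data).
annual_index = [math.sin(0.3 * year) + 0.01 * year for year in range(100)]

anomalies = standardize(annual_index)
smoothed = running_mean(anomalies, window=10)
print(len(annual_index), "annual values ->", len(smoothed), "smoothed values")
```

Standardizing each series separately is what allows PDSI and soil moisture, which have different units, to be shown on a common axis.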